Reinforcement Learning in Partially Observable Markov Decision Processes using Hybrid Probabilistic Logic Programs

Author

  • Emad Saad
Abstract

We present a probabilistic logic programming framework for reinforcement learning that integrates reinforcement learning in POMDP environments with normal hybrid probabilistic logic programs under probabilistic answer set semantics, and that is capable of representing domain-specific knowledge. We formally prove the correctness of our approach. We show that finding a policy for a reinforcement learning problem in our approach is NP-complete. In addition, we show that any reinforcement learning problem can be encoded as a classical logic program with answer set semantics. We also show that a reinforcement learning problem can be encoded as a SAT problem. We present a new high-level action description language that allows a factored representation of POMDPs. Moreover, we modify the original POMDP model so that it can distinguish between knowledge-producing actions and actions that change the environment.
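
As a rough illustration of the distinction between knowledge-producing actions and actions that change the environment, the sketch below runs a standard Bayesian belief update over a tiny POMDP. It is not the paper's logic programming encoding or its action language. The two-state door domain, the action names `push` and `look`, the observation names, and all probabilities are hypothetical and chosen only for illustration. The assumption made here is that a knowledge-producing action leaves the state unchanged and returns an informative observation, while a world-changing action alters the state and returns an uninformative one.

```python
from typing import Dict

Belief = Dict[str, float]

# Hypothetical two-state domain; every name and number below is an assumption.
STATES = ["door_open", "door_closed"]

# Transition model: T[action][s][s2] = P(s2 | s, action).
T = {
    # World-changing action: opens a closed door with probability 0.9.
    "push": {
        "door_open": {"door_open": 1.0, "door_closed": 0.0},
        "door_closed": {"door_open": 0.9, "door_closed": 0.1},
    },
    # Knowledge-producing action: identity transition, the state is untouched.
    "look": {
        "door_open": {"door_open": 1.0, "door_closed": 0.0},
        "door_closed": {"door_open": 0.0, "door_closed": 1.0},
    },
}

# Observation model: O[action][s2][o] = P(o | s2, action).
O = {
    # Uninformative: the same observation in every state.
    "push": {
        "door_open": {"none": 1.0},
        "door_closed": {"none": 1.0},
    },
    # Informative but noisy sensor, correct with probability 0.8.
    "look": {
        "door_open": {"see_open": 0.8, "see_closed": 0.2},
        "door_closed": {"see_open": 0.2, "see_closed": 0.8},
    },
}


def update_belief(belief: Belief, action: str, observation: str) -> Belief:
    """Return the posterior belief after taking `action` and seeing `observation`."""
    unnormalized = {}
    for s2 in STATES:
        # Prediction step: push the belief through the transition model.
        predicted = sum(belief[s] * T[action][s][s2] for s in STATES)
        # Correction step: weight by the likelihood of the observation.
        unnormalized[s2] = O[action][s2].get(observation, 0.0) * predicted
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {s: p / total for s, p in unnormalized.items()}


uniform = {"door_open": 0.5, "door_closed": 0.5}
after_look = update_belief(uniform, "look", "see_open")
after_push = update_belief(uniform, "push", "none")
```

Under these assumed models, `look` sharpens the belief without moving any probability mass between states through the transition step, whereas `push` shifts mass toward `door_open` without the observation adding any information.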

Similar resources

Non Deterministic Logic Programs

Non deterministic applications arise in many domains, including stochastic optimization, multi-objective optimization, stochastic planning, contingent stochastic planning, reinforcement learning, reinforcement learning in partially observable Markov decision processes, and conditional planning. We present a logic programming framework called non deterministic logic programs, along with a decl...

Probabilistic and Decision-Theoretic User Modeling in the Context of Software Customization

Research in the field of user modeling has aimed to supersede the current “one-size-fits-all” trend in software development that forces users to change their behaviour according to the preprogrammed functions. This paper discusses aspects of user modeling and its relevance to software customization. In particular, we focus on user modeling techniques that utilize probabilistic and decision-theo...

Sequential Constant Size Compressors for Reinforcement Learning

Traditional Reinforcement Learning methods are insufficient for AGIs who must be able to learn to deal with Partially Observable Markov Decision Processes. We investigate a novel method for dealing with this problem: standard RL techniques using as input the hidden layer output of a Sequential Constant-Size Compressor (SCSC). The SCSC takes the form of a sequential Recurrent Auto-Associative Me...

CLIPP: Combining Logical Inference and Probabilistic Planning

Planning on mobile robots deployed in complex real-world application domains is a challenge because: (a) robots lack knowledge representation and common sense reasoning capabilities; and (b) observations from sensors are unreliable and actions performed by robots are non-deterministic. In this talk, I shall describe a hybrid framework named CLIPP that combines answer set programming (ASP) and h...

Reinforcement Learning for Problems with Hidden State

In this paper, we describe how techniques from reinforcement learning might be used to approach the problem of acting under uncertainty. We start by introducing the theory of partially observable Markov decision processes (POMDPs) to describe what we call hidden state problems. After a brief review of other POMDP solution techniques, we motivate reinforcement learning by considering an agent wi...

Journal:
  • CoRR

Volume: abs/1011.5951

Publication date: 2010